Mon. Not. R. Astron. Soc. 000, 000-000 (2009) Printed 20 September 2010 (MN LaTeX style file v2.2)



Figures of Merit for Testing Standard Models: 
Application to Dark Energy Experiments in Cosmology 






A. Amara*1 & T. D. Kitching†2

1 Department of Physics, ETH Zurich, Wolfgang-Pauli-Strasse 16, CH-8093 Zurich, Switzerland

2 SUPA, Institute for Astronomy, University of Edinburgh, Royal Observatory Edinburgh, Blackford Hill, EH9 3HJ 



Accepted — . Received — ; in original form 



ABSTRACT 

Given a standard model to test, an experiment can be designed to: (i) measure the 
standard model parameters; (ii) extend the standard model; or (iii) look for evidence 
of deviations from the standard model. To measure (or extend) the standard model, 
the Fisher matrix is widely used in cosmology to predict expected parameter errors 
for future surveys under Gaussian assumptions. In this article, we present a
framework that can be used to design experiments such that they maximise the chance of
finding a deviation from the standard model. Using a simple illustrative example,
discussed in the appendix, we show that the optimal experimental configuration can
depend dramatically on the optimisation approach chosen. We also show some simple
cosmology calculations, where we study Baryon Acoustic Oscillation and Supernovae
surveys. In doing so, we also show how external data, such as the positions of the
CMB peaks measured by WMAP, and theory priors can be included in the analysis. 
In the cosmological cases that we have studied (DETF Stage III), we find that the 
three optimisation approaches yield similar results, which is reassuring and indicates 
that the choice of optimal experiment is fairly robust at this level. However, this may 
not be the case as we move to more ambitious future surveys. 

In cosmology, the ΛCDM concordance model has become
our standard model of the Universe. This model satisfies 
current data and depends on three critical sectors: (i) Dark 
Energy; (ii) Dark Matter; and (iii) Initial Conditions. These 
sectors are linked through our theory of gravity - general 
relativity. Although this model is well defined, the addition 
of each component has typically been done to explain the 
available data rather than arising from some fundamental 
theory of the cosmos. Hence, cosmology is currently in a 
data-driven era, with little known about the fundamental 
nature of dark matter and dark energy. As a result, a significant
effort is underway in this very active field to build
experiments to measure and extend our standard model.
These include KIDS, Pan-STARRS, DES, LSST, JDEM
and Euclid. In planning such future observations, the approach
to date has been to optimise the experimental and
methodological designs to minimise the errors on extended
parameters. In particular, the dark energy equation of state
(the ratio of pressure to density of dark energy, $w(z)$) garners
the most attention and is typically parameterised in
terms of a second order Taylor expansion in the scale factor
or redshift $z$ (e.g. $w(z) = w_0 + w_a z/(1+z)$). Experiments are
then designed to measure these equation of state parameters
to the highest possible precision. The dark energy Figure of
Merit (FoM; Albrecht et al. 2006), which is inversely proportional to
the area of the error ellipse in the $w_0$-$w_a$ plane, is widely used
to gauge performance. Other possible metrics have also been
suggested, such as the addition of parameters to test for deviations
from Einstein gravity or the division of $w(z)$ into
a large number of redshift slices that can then be used to
construct principal components through a matrix inversion
(Albrecht et al. 2009; Huterer & Starkman 2003). However,
these too suffer from their own problems. For instance, the
additional modified gravity parameters may not be strongly
motivated and the eigenfunction decomposition of $w(z)$ can
suffer from instabilities (Kitching & Amara 2009).

In this article, we present an alternative methodology to be 
applied to experimental design when faced with a standard 
model and no guidance from theory. We show that an exper- 
iment can be designed such that the probability of breaking 
the standard model (finding evidence against the model) can 
be maximised. 

This article is organised as follows. In Section 2, we review
the alternative approaches to experimental design. We then,
in Section 3, compare each approach using a simple explanatory
model, as well as a cosmological example that studies
the performance of the 'current' and Stage III experiments
discussed in Albrecht et al. (2006). We summarise our
conclusions in Section 4.



2 APPROACHES TO EXPERIMENT DESIGNING

When planning an experiment with a standard model (a set 
of parameters) in mind, we can think of three possible ap- 
proaches that we can take. The first is to stay within the 
standard model and to design an experiment that will mea- 
sure the parameters of this model to the highest possible pre- 
cision. The next is to extend the standard model (add extra 
parameters), and ideally this extension would be driven by a 
compelling theoretical framework with clear testable predic- 
tions. Finally, in the absence of any compelling theory, one 
can take a more exploratory approach, where the driving aim 
is to design an experiment with the greatest chance of break- 
ing the standard model. Ideally, this approach would depend 
only on well-founded knowledge, such as today's data, the 
expected error bars of future data and the standard model 
that is being tested. 



2.1 Measuring the Standard Model 



Within a well-specified model, the Fisher matrix formalism
(Tegmark et al. 1997) is a well-defined framework for estimating
the errors that a given experiment will have on the
measurement of the parameters of the model. For an experiment
where the parameters have an effect on the mean, the
Fisher matrix is defined as

$$F_{ij} = \sum \frac{1}{\Delta C^2}\,\frac{\partial C}{\partial\theta_i}\,\frac{\partial C}{\partial\theta_j}, \qquad (1)$$

where the sum runs over the data points, $C$ is some observable
signal, $\Delta C$ is the expected error for an experiment and
$\theta$ is a vector containing the parameters. A cosmology model
may include $\theta = \{\sigma_8, \Omega_m, \Omega_b, \Omega_\Lambda, n_s, h, \mathrm{etc}\}$,
where, for instance, the dark energy equation of state is assumed
to be a cosmological constant ($w(z) = -1$). The errors on each of
these parameters are then given by the diagonal elements of the
parameter covariance matrix (Cov), which is given by Cov $= F^{-1}$.
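
As a minimal sketch of the Fisher matrix calculation in equation 1 (not the authors' code), the Python snippet below builds the matrix for a hypothetical two-parameter observable $C(x) = a + b\,x$ and inverts it to obtain the parameter covariance. The model, the sample positions and the error bars are illustrative assumptions; the quoted 1-sigma errors are the square roots of the diagonal of the covariance matrix.

```python
import numpy as np

def fisher_matrix(derivs, sigma):
    """Fisher matrix for parameters that only affect the mean of the data.

    derivs: array (n_params, n_data) holding dC/dtheta_i at each data point.
    sigma:  array (n_data,) of expected errors on the observable C.
    """
    derivs = np.asarray(derivs, dtype=float)
    weights = 1.0 / np.asarray(sigma, dtype=float) ** 2
    # F_ij = sum over data points of (dC/dtheta_i)(dC/dtheta_j) / sigma^2
    return (derivs * weights) @ derivs.T

# Hypothetical observable C(x) = a + b*x sampled at three assumed positions.
x = np.array([4.0, 8.0, 12.0])
sigma = np.array([1.0, 1.0, 1.0])          # assumed error bars
derivs = np.vstack([np.ones_like(x), x])   # rows: dC/da, dC/db

F = fisher_matrix(derivs, sigma)
cov = np.linalg.inv(F)                     # parameter covariance, Cov = F^-1
errors = np.sqrt(np.diag(cov))             # marginalised 1-sigma errors
print(F)
print(errors)
```

Inverting the full matrix marginalises each error over the other parameter; fixing the other parameter instead would give $1/\sqrt{F_{ii}}$.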



2.2 Extending the Standard Model 

When seeking out new physics, we look for ways of going be- 
yond the standard model. Ideally this would be done through 
the guidance of theory. There are many examples of cases 
where theories have been put to the test by experiments 
based on verifiable predictions. One such example is neutrino 
mass. In the standard model of particle physics, neutrinos 
have zero mass, but the assumption of zero mass is an ad hoc 
choice. A natural and physically motivated extension of this 
model was to add mass to neutrinos (through the lepton mixing
matrix addendum). Neutrino mass has now been experimentally
confirmed by a number of particle physics experiments
(Ahmed et al. 2004; Eguchi et al. 2003; Ahn et al.
2006), and cosmological experiments should be able to constrain
this mass to high accuracy (e.g. Refregier et al. 2010;
Thomas et al. 2009; Kitching et al. 2008).

Extra parameters, $\psi$, can be added to the parameters of the
standard model, $\theta$. In this case, the Fisher matrix formalism
can once again be used to estimate the errors on all the
parameter sets. Here, it becomes useful to decompose the
matrix as

$$F = \begin{pmatrix} F^{\theta\theta} & F^{\theta\psi} \\ F^{\psi\theta} & F^{\psi\psi} \end{pmatrix}, \qquad (2)$$

where the matrix $F^{\theta\theta}$ contains the Fisher matrix elements
for the parameters of the standard model, $F^{\psi\psi}$ contains the
elements for the new model parameters and $F^{\theta\psi}$ contains
the cross terms.
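
To illustrate why the block decomposition is convenient, the sketch below splits a Fisher matrix into its standard-model, extension and cross-term blocks, then marginalises over the standard-model parameters with a Schur complement. This is a standard linear-algebra identity rather than a procedure stated in the text, and the 3x3 matrix and the function names are hypothetical.

```python
import numpy as np

def split_fisher(F, n_std):
    """Split F into standard-model, cross-term and extension blocks."""
    F = np.asarray(F, dtype=float)
    F_std = F[:n_std, :n_std]      # standard-model parameters
    F_cross = F[:n_std, n_std:]    # cross terms
    F_ext = F[n_std:, n_std:]      # extra (extension) parameters
    return F_std, F_cross, F_ext

def marginalised_extension_fisher(F, n_std):
    """Fisher matrix of the extra parameters after marginalising over
    the standard-model parameters (Schur complement of the F_std block)."""
    F_std, F_cross, F_ext = split_fisher(F, n_std)
    return F_ext - F_cross.T @ np.linalg.solve(F_std, F_cross)

# Hypothetical Fisher matrix: two standard parameters plus one extra.
F = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])

F_marg = marginalised_extension_fisher(F, n_std=2)
sigma_ext = np.sqrt(np.linalg.inv(F_marg)[0, 0])    # via the block form
sigma_ext_full = np.sqrt(np.linalg.inv(F)[2, 2])    # via the full inverse
sigma_ext_fixed = 1.0 / np.sqrt(F[2, 2])            # standard parameters held fixed
print(sigma_ext, sigma_ext_full, sigma_ext_fixed)
```

The two marginalised values agree, and both are at least as large as the error obtained when the standard-model parameters are held fixed, which is how the cross terms degrade the constraint on an added parameter.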

This approach has been widely adopted by the cosmological 
community in dark energy studies. In this case, the extra pa- 
rameters are typically added in the form of equation of state 
parameters (the ratio of pressure to density) of dark energy 
(w). However, this is a specific way of thinking about dark 
energy (as a dynamical fluid). Therefore, models that do not 
treat dark energy as a fluid have to work in terms of an 'effec- 
tive' equation of state. A further complexity arises because 
the observed low redshift acceleration that motivates dark 
energy could result from other physics, such as the break- 
down of Einstein gravity on cosmic scales. A move away from 
Einstein gravity may not be well represented by the addi- 
tion of equation of state parameters and may require the 
addition of new parameters that specifically allow for such 
deviations. As a result, these extra dark energy parameters 
do not have a firm theoretical basis but are, in fact, an arbitrary
expansion of the equation of state (Kitching & Amara
2009).



2.3 Breaking the Standard Model

Here, we introduce a new approach to experimental plan- 
ning, where we explicitly design an experiment to maximise 
the probability of finding a deviation from the standard 
model. This deviation is allowed to come from any part of 
the theory and should not depend on any particular theoret- 
ical extension of the standard model. The robustness of such 
an approach can be achieved by relying on minimal inputs, 
namely: (i) current data; (ii) expected error bars of future 
measurements; and (iii) the standard model that we want 
to test. 

We begin by defining some basic parameters. Let $X$ be a
data vector containing today's measurements (for instance
a correlation function). These data points have associated
errors, $\sigma_X$, which means that the measured data points are
randomly scattered about $T$, the data vector that would be
measured with no measurement error or systematic, i.e. the
underlying values of the observable as measured with the
perfect experiment$^{7}$. The expected error bars of a future
experiment are $\sigma_Y$, which would produce a data vector $Y$.
Given today's data, we can calculate the probability of the
future data, $P(Y|X)$, by marginalising over $T$,

$$P(Y|X) = \int P(Y|T)\,P(T|X)\,dT, \qquad (3)$$

where $P(T|X)$ is the probability of $T$ given today's data
and $P(Y|T)$ is the probability of the future data given $T$.
The integral is performed over all possible $T$ since we do not
know what $T$ is a priori.

For each realisation of the future data, there will be an associated
best-fit that can be achieved with the standard model.
We focus here on the $\chi^2_{\min}$. With the probability distribution
of future data given current data ($P(Y|X)$), which, for
simplicity, we will sometimes also denote using $P(Y)$, we
can calculate the expectation value of the minimum $\chi^2$ by
integrating over all possible future data vectors:

$$\langle\chi^2_{\min}\rangle = \int \chi^2_{\min}(Y)\,P(Y)\,dY. \qquad (4)$$

A high $\chi^2_{\min}$ means that the standard model is not able to
give a good fit to future data. Hence, an experiment designer
who wants to maximise his or her chances of breaking the
standard model should focus on an experiment configuration
that maximises the expectation value of the minimum
$\chi^2$: $\max[\langle\chi^2_{\min}\rangle]$. Strictly, we should use a quantity that is
robust to the number of data points (for instance the reduced
$\chi^2$). We avoid such problems in what follows by only
making comparisons between experiments with equal numbers
of data points. The $\chi^2$ and reduced $\chi^2$ are, therefore,
simply scaled versions of each other. In this work, we have
focused on the expectation value of the minimum $\chi^2$ of the
future data, with the understanding that a $\chi^2$ corresponding
to a reduced $\chi^2$ significantly larger than one will require
additional parameters beyond those available in the standard
model. However, it may be interesting to also consider
the higher order statistics of the minimum $\chi^2$ distribution.
Along similar lines of thought, our FoM could also be recast
in terms of the probability that a future experiment
will give a $\chi^2_{\min}$ greater than some threshold value. For the
work presented here, we use the simplest expression (given
in equation 4), but we are continuing to investigate further
possible expressions of this model breaking FoM.

Here, we use the maximum likelihood fit to the data (minimum
$\chi^2$). We have used this frequentist measure, as opposed
to a Bayesian evidence criterion, because there are no objective
Bayesian measures in the case of assessing the quality
of a theoretical fit for a single model, given that a single
model Bayesian evidence must conclude (through a normalisation
of probabilities) that there is 100% evidence for that
model (see Taylor & Kitching, 2010 for further discussion).
In general, this $\chi^2_{\min}(Y)$ measure could be replaced with any
'goodness of fit' criterion $G(Y)$, where equation 4 optimises
fit.

7 As an example, if $X$ is calculated from the mean of $n$ independent
data points and the errors are given by the variance
($\sigma^2(\bar{X}) = \sigma^2(X)/n$), then $T$ would be the measure given as $n$
goes to infinity in the absence of systematics. We note that, in
this case, cosmic variance would come from the fact that due to
a finite Universe the number of independent data points will be
limited to a finite number.



3 APPLICATION

3.1 Illustrative Example

In Appendix A, we explore the impact of the choice of optimisation
metric on a simple illustrative example. We set up
a system of three data points and 'a standard model' that
is a straight line with one degree of freedom, the slope of
the line. What this shows is that the optimal configuration
of a future experiment can vary drastically and can lead
to exactly opposite optimisations in some cases depending
on whether model breaking or standard model extension is
used.



The simple model that we set up has a 'pivot point,' where 
the model makes an exact prediction, C(x = 8) = 10. To 
measure the standard model parameter (the slope), assum- 
ing that this model is correct, it is clear that there is no 
sensitivity at this point. Therefore, an optimisation will min- 
imise future error bars away from the pivot point. However, 
in the model breaking mode, it is optimal to place the small- 
est future error bars at the pivot point, since it is here that 
even the slightest deviation from the standard model predic- 
tion would yield proof that the standard model is broken. 
Of course the model breaking paradigm here is a high-risk, 
high-gain approach. If T happens to have the same value 
as that of the pivot point, then this approach would yield 
no extra information. When extending the standard model, 
the optimal configuration is entirely dependent on the exact 
form of the extension. For instance, a clear difference is seen 
between a standard model that is extended by adding a constant
parameter and one that is expanded with a parabolic
term about the pivot point, thereby preserving the pivot
point.
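
The contrast between measuring the slope and breaking the model can be sketched numerically. The toy below is not the Appendix A calculation: the sample positions, the current data values and all error bars are assumed for illustration, the minimum $\chi^2$ is evaluated on the future data alone, and Gaussian distributions are assumed so that the future data scatter about the current data with variance $\sigma_X^2 + \sigma_Y^2$. Only the pivot, $C(x=8)=10$, is taken from the text.

```python
import numpy as np

PIVOT_X, PIVOT_C = 8.0, 10.0   # exact model prediction C(x = 8) = 10

def slope_error(x, sigma_y):
    """Fisher 1-sigma error on the slope m for C(x) = 10 + m (x - 8)."""
    d = x - PIVOT_X                          # dC/dm at each data point
    return 1.0 / np.sqrt(np.sum(d**2 / sigma_y**2))

def chi2_min(y, x, sigma_y):
    """Minimum chi^2 over the slope (analytic, the model is linear in m)."""
    d = x - PIVOT_X
    r = y - PIVOT_C
    w = 1.0 / sigma_y**2
    fitted = np.sum(w * d * r, axis=-1) ** 2 / np.sum(w * d**2)
    return np.sum(w * r**2, axis=-1) - fitted

def expected_chi2_min(x, data_x, sigma_x, sigma_y, n_samples=20000, seed=0):
    """Monte-Carlo estimate of <chi^2_min> over future data Y given X."""
    rng = np.random.default_rng(seed)
    # Gaussian P(T|X) and P(Y|T) marginalise to Y ~ N(X, sigma_x^2 + sigma_y^2).
    scatter = np.sqrt(sigma_x**2 + sigma_y**2)
    y = rng.normal(data_x, scatter, size=(n_samples, x.size))
    return chi2_min(y, x, sigma_y).mean()

x = np.array([4.0, 8.0, 12.0])           # assumed sample positions
data_x = np.array([8.2, 10.4, 11.7])     # assumed current measurements X
sigma_x = np.full(3, 0.5)                # assumed current errors

configs = {
    "small error at pivot": np.array([0.5, 0.1, 0.5]),
    "small errors off pivot": np.array([0.1, 0.5, 0.1]),
}
results = {
    name: (slope_error(x, s), expected_chi2_min(x, data_x, sigma_x, s))
    for name, s in configs.items()
}
for name, (err, chi2) in results.items():
    print(f"{name}: slope error = {err:.3f}, <chi2_min> = {chi2:.1f}")
```

With these assumed numbers, the configuration that concentrates precision at the pivot gives the larger expected minimum $\chi^2$, while the configuration that concentrates precision away from the pivot gives the smaller slope error.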



3.2 Cosmological Application 

We now apply our approach to investigate the planning of 
cosmology surveys. In this work, we focus on some sim- 
ple examples that show how this can be done, with a 
more complete investigation of future surveys to follow 
in later work. In this example, we focus on supernovae
(SNe) (Tegmark et al. 1998) and Baryon Acoustic Oscillation
(BAO) (e.g. see Rassat et al. 2008 for discussion). In
addition, we will show: (i) how external data, in this case the
CMB peak separation, can be added; (ii) how priors coming
from theory can be included; and (iii) a simple treatment
for systematic errors.



3.2.1 Survey Configurations

Due to the computational limits of performing the integral
shown in equation 4, the dimensionality of which scales with
the number of data points, we have decided to bin the low
redshift data (i.e. SNe and BAO) into four redshift bins (i.
$0.1 < z < 0.4$; ii. $0.4 < z < 0.7$; iii. $0.7 < z < 1.0$; and iv.
$1.0 < z < 1.3$). By fixing the number of redshift bins, and
therefore the number of degrees of freedom since the standard
cosmology model is the same for all cases, we are able
to compare the $\chi^2_{\min}$ values directly. This simplifies the
comparison between different survey configurations. For current
BAO data, we use the galaxy number counts presented in
Percival et al. (2010). This work presented a BAO analysis
of the Sloan Digital Sky Survey Data Release 7 sample
(DR7). This is composed of roughly 900,000 galaxies
over 9100 deg$^2$ in the redshift range $z = [0.0, 0.5]$. We
re-binned this data into our four redshift bins, which leads to
the distribution shown in Table 1. For current SNe data,
we use the Union data presented in Kowalski et al. (2008).
This is a compilation of SNe data coming from a number
of measurements, including the Supernova Legacy Survey,
the ESSENCE Survey and supernovae measurements from
the Hubble Space Telescope (HST). Once again, as with the
BAO data, we have re-binned this data to match the four
bins that we use in this paper (see Table 2). As we will
discuss in Section 3.2.2, we have also included constraints
coming from current measurements of the CMB peak separation
presented in Komatsu et al. (2009), which uses the
WMAP data.

For future surveys, we have decided to focus on a configuration
that illustrates the technique presented here, rather
than to make concrete recommendations about specific mission
concepts. The reason for this is that the calculations
that we present here include a number of simplifications,
such as using only four redshift bins. These, we feel, allow
us to calculate trends and make some statements about the
relative merits of broad concept ideas. However, to draw detailed
conclusions on specific mission configurations would
take further detailed work that we will address in follow up
publications on this topic. For the future surveys that we
use to illustrate our method we have relied on the Stage III
surveys described in Albrecht et al. (2006), although many
of the projects may have evolved since this document was
released. Once again, we re-bin the Stage III data into our
four redshift bins (see Tables 1 and 2).

Figure 1. Fractional errors on the observed quantities for 'current'
(black) and stage III (red) experiments. For the BAO measurements,
these are the errors on the transverse BAO scale from
Blake et al. For the SNe surveys, the observable is the flux loss
of the SNe.

For the BAO surveys, we simplify the analysis by only using
the tangential modes, which is pessimistic, and assume
no systematics, which is optimistic. For these reasons,
the results below are illustrative, and we do not claim that
the optimistic and pessimistic approaches cancel out each
other. We calculate the errors on the BAO scale using the fitting
function given in Blake et al. (2006), which has been implemented
in iCosmo (Refregier et al. 2008; Kitching et al.
2009). For the Supernova error calculations, we have used
the Fisher matrix approach outlined in Tegmark et al.
(1997) and Huterer & Turner (2001) and have assumed the
systematic contributions outlined in Kim et al. (2004) and
Ishak et al. (2006). However, we will also show results without
systematics in order to gauge their impact.



3.2.2 Including External Data 

In this study, we focus on the potential of future BAO and
SNe surveys. It is, however, straightforward to include other
data sets. To do this, we must decide whether to only include
current measurements (for instance, in the case of the CMB
to include WMAP data) or try and anticipate the joint impact
of future measurement of that probe (for instance, to
include predictions for Planck$^{8}$). If the latter is desired, then
the prescription for doing so follows the same logic as that
used for the BAO and SNe calculations and would increase
the data vectors ($Y$ and $X$) in equation 3. While conceptually
simple, adding external data in this way can quickly lead
to computational challenges, since the dimensionality of the
integral scales with the number of data points. The computation
time for convergent results can grow quickly, even using a
simple Monte-Carlo integration scheme. To solve potential
problems, we would either need to develop a sophisticated
Monte-Carlo integration scheme with, for instance, importance
sampling that is tailor made for this problem or try
to reduce the number of data points by focusing on specific
features of the external data that we wish to consider. For
instance, in the case of the CMB we can consider adding
the peak position and height information rather than implementing
the full correlation data ($C(\ell)$).

                     Area       Number Density of Galaxies ($n_g$) [num/arcmin$^2$]
                     [deg$^2$]  0.1<z<0.4   0.4<z<0.7   0.7<z<1.0   1.0<z<1.3

Current              10000      0.013       0.00056     0.0         0.0
Stage III WiggleZ    1000       0.0         0.022       0.089       0.0
Stage III BOSS       10000      0.014       0.019       0.0         0.0
Stage III WFMOS      2000       0.0         0.056       0.22        0.18

Table 1. Parameters of the BAO surveys considered in this study. The current survey is chosen to be close to the BAO survey parameters
for the SDSS DR7 (Percival et al. 2010). The future surveys have been chosen from the Stage III surveys of the Dark Energy Task
Force report (Albrecht et al. 2006).

                     Number of Supernovae ($n_s$)
                     0.1<z<0.4   0.4<z<0.7   0.7<z<1.0   1.0<z<1.3

Current SNe          51          107         131         18
Stage III SNe        965         1940        860         57

Table 2. Parameters of the Supernovae surveys considered in this study. The current survey is chosen to be close to the Union supernovae
sample (Kowalski et al. 2008). The future surveys have been chosen from the Stage III surveys of the Dark Energy Task Force report
(Albrecht et al. 2006).

If we only add existing external data, then the calculation
is greatly simplified, since the dimensionality of the integral
in equation 4 remains the same. Instead, the external data
is simply used when calculating the minimum $\chi^2$. In the
work presented here, we have included the measured spacing
of the acoustic oscillation peaks of the CMB, $\ell_A$, which
depends on the ratio of angular diameter distance to the
sound horizon at photon decoupling epoch ($z_*$),

$$\ell_A = (1 + z_*)\,\frac{\pi D_A(z_*)}{r_s(z_*)}, \qquad (5)$$

where $D_A$ is the angular diameter distance and $r_s$ is the
sound horizon. This peak spacing has been measured to be
$\ell_A = 302.1 \pm 0.86$ for WMAP (Komatsu et al. 2009), which
gives the error, $\sigma_c$, used for this data point when calculating
$\chi^2$ (Section 3.2.4). For the sound horizon calculation, we follow
the calculations presented in Appendix A of Parkinson et al. (2007).

8 http://www.rssd.esa.int/SA/PLANCK/docs/Bluebook-ESA-SCI(2005)1_V2.pdf
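
As a small sketch of how the peak spacing in equation 5 enters the fit, the functions below evaluate $\ell_A$ from a model's angular diameter distance and sound horizon, and return its contribution to $\chi^2$ using the measurement quoted above. The function names and the input values for $z_*$, $D_A$ and $r_s$ are illustrative assumptions (chosen to be roughly WMAP-like), not values from the text; a real calculation would derive them from the cosmological parameters.

```python
import math

def acoustic_scale(z_star, d_a, r_s):
    """Peak spacing l_A = (1 + z*) * pi * D_A(z*) / r_s(z*).

    d_a and r_s must be in the same units (e.g. Mpc).
    """
    return (1.0 + z_star) * math.pi * d_a / r_s

def chi2_cmb(l_a_model, l_a_obs=302.1, sigma=0.86):
    """chi^2 contribution of the CMB peak spacing (measured value from the text)."""
    return ((l_a_model - l_a_obs) / sigma) ** 2

# Assumed, illustrative model outputs.
z_star = 1090.0   # redshift of photon decoupling
d_a = 12.94       # angular diameter distance to z*, in Mpc
r_s = 146.8       # sound horizon at z*, in Mpc

l_a = acoustic_scale(z_star, d_a, r_s)
print(l_a, chi2_cmb(l_a))
```

Because this data point has no future counterpart in the calculation, it adds a single term to the minimum $\chi^2$ without increasing the dimensionality of the integral.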



3.2.3 Theory Priors and Calculating Probabilities 

We now turn our attention to priors coming from our theory
and how these can bound our results. For example, if
we impose no knowledge at all about what we expect, then
the PDFs for each of the data points in equation 3 are independent.
A simple consequence of this is that the probability
distribution for future data in bins where no current
data exists ($P(Y|X)$) will be flat between $-\infty$ and $\infty$.
Inputting this PDF into equation 4 would lead to a $\langle\chi^2_{\min}\rangle$ of
infinity, which is not useful when comparing expected
performances. One can view this result in two ways. The
first is that a data purist (i.e. someone who wishes not to
add any bounds from theory) would conclude that the best
surveys are those that explore new regions where no measurements
have yet been made. The alternative approach is
to introduce some expectation from our knowledge of basic
cosmological theory. Theory priors modify the PDFs of future
data by imposing relationships between different data
points. A simple addition is to impose a link between the
angular diameter distance and the luminosity distance.

For the configurations shown in Table 1, we immediately
see that if we take no guidance from theory then we will be
driven towards WiggleZ and WFMOS (see Table 1), since
these two surveys will provide BAO measurements at redshifts
that are not explored by current BAO experiments
and, hence, have an expectation value of minimum $\chi^2$
of infinity. Once again, a data purist may argue that these
surveys should, therefore, be our top priority. In contrast,
another simple approach is to rely on the widely accepted
relationship between angular diameter distance ($D_A$) and
luminosity distance ($D_L$) given by

$$D_L = (1 + z)^2 D_A. \qquad (6)$$

By explicitly adding this very weak prior from the theory,
the probability of future data is modified (equation 3) to

$$P(Y|X) = \int P(Y_B|D_L)\,P(Y_S|D_L)\,P(D_L|X_B)\,P(D_L|X_S)\,dD_L, \qquad (7)$$

where $Y_B$ and $Y_S$ are the data vectors for future surveys
for BAO and SNe (respectively) and $X_B$ and $X_S$ are the
data vectors for today's surveys. This PDF, therefore, includes
a relationship between the SNe measurements and
the BAO measurements at any given redshift. For what we
present later, this relationship between distances is the only
information that we impose from theory. However, a natural
question is what would happen if the future data were to
extend to redshifts that are not covered by either the BAO
or the SNe data? A detailed exploration of this will be presented
in follow-up work. Nonetheless, here we give a brief
discussion of the basic principles. Once again, priors from
theory can be used to impose relationships between different
data points, which in turn modify the PDF of the future
data. In particular, the question raised here would look for
relationships between data points at different redshifts. This
can be done by introducing an integral relationship between
distance (co-moving, $D_c$) and the Hubble function, $H(z)$,

$$D_c(z) = c \int_0^z \frac{dz'}{H(z')}, \qquad (8)$$

where $c$ is the speed of light. Without resorting to the Friedmann
equation, which links $H(z)$ to density parameters of
the matter-energy components of the Universe, we can place
simple constraints on the functional form of $H(z)$ that can be
used to compute the probability of future data. For instance,
an assumption that $H(z)$ is a positive definite function over
cosmic time would bound the comoving distance at a redshift
of $z_i$ to be between the comoving distances at $z_{i-1}$ and
$z_{i+1}$, i.e. that of the redshifts on either side. Here the inclusion
of the CMB, with $z \sim 1100$, becomes very useful.
The advantage of this approach is that all knowledge from
theory, including simple relationships, such as that between
$D_L$ and $D_A$, must be included explicitly. This then allows us
to decide explicitly what assumptions should be included.



3.2.4 Computation of $\langle\chi^2_{\min}\rangle$

For each realisation of the future data ($Y$) we calculate
the weighted average data, which is given by

$$X_c = \frac{\sigma_X^2 Y + \sigma_Y^2 X}{\sigma_X^2 + \sigma_Y^2}, \qquad (9)$$

where $X_c$ is the value of the combined data, $Y$ and $X$ are
the future and current data values, and $\sigma$ are the associated
errors. The errors on the combined data are

$$\sigma_c^2 = \frac{\sigma_X^2 \sigma_Y^2}{\sigma_X^2 + \sigma_Y^2}. \qquad (10)$$

The data vector $X_c$ can also contain external data for which
there will not be corresponding future measurements. In this
case, the data vector entries that correspond to the external
data have $X_c = X$ and $\sigma_c = \sigma_X$. With this combined data
vector, we then calculate $\chi^2$,

$$\chi^2 = \sum \frac{(X_c - M)^2}{\sigma_c^2}, \qquad (11)$$

where the sum is over the entries of the data vector. In our
case, this corresponds to a total of nine data points (BAO
scale at four redshifts, SNe at four redshifts and the CMB
peak spacing). For a given choice of cosmology parameters,
$M$ is the value given by the model. For each integration step,
we use a minimiser to find the parameters that lead to the
smallest $\chi^2$ value.

The Stage III surveys will look for deviations from the standard
ΛCDM concordance model. We consider the standard
cosmological model as one with Gaussian initial conditions$^{9}$
following inflation, with scale-free perturbations ($n_s = 1$),
where spatial curvature is allowed and dark energy is understood
to come from the cosmological constant $\Lambda$ (i.e.
$w = -1$). Since we only consider the distance-redshift measurements,
we are sensitive to the following parameters of
the model$^{10}$: $\{\Omega_m, \Omega_\Lambda, h\}$. The model breaking approach
does not rely on adding further parameters beyond these well-understood
ones and will test how likely it is that future experiments,
based on today's data, would find any deviation
from ΛCDM, including, for example, evidence for $w \neq -1$.

We perform the integral in equation 4 over all possible realisations
of the future data, which corresponds to an eight
dimensional integral (four future BAO and four future SNe).
For practical reasons to do with computational feasibility, we
use the simple Monte-Carlo integration technique outlined in
section 7.7 of Numerical Recipes (Press et al. 2007). Here,
a multidimensional integral (in our case, equation 4) can be
expressed as

$$\int f\,dV \approx V\langle f\rangle \pm V\sqrt{\frac{\langle f^2\rangle - \langle f\rangle^2}{N}}, \qquad (12)$$

where the expectation values, denoted by the angular brackets,
can be calculated by randomly sampling the function $f$
at positions $x_i$ with

$$\langle f\rangle \equiv \frac{1}{N}\sum_{i=0}^{N-1} f(x_i), \qquad \langle f^2\rangle \equiv \frac{1}{N}\sum_{i=0}^{N-1} f^2(x_i). \qquad (13)$$

The volume of the parameter space is denoted as $V$. This
is set by the bounds of the integral, which we have chosen
in such a way as to ensure that the integrand is vanishingly
small at this limit.

9 See Amara & Refregier (2004); Desjacques & Seljak (2010);
Pillepich et al. (2010) for examples of how non-Gaussian initial
conditions impact observables at low redshifts.

10 We note that there is a weak dependence on $\Omega_b$ through $z_*$,
but we have neglected this here since it has little impact on the
results and only complicates the calculation.
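
The simple Monte-Carlo scheme described above can be sketched in a few lines of Python. This is an illustrative implementation with uniform sampling inside a rectangular volume, not the authors' code; the function name, the bounds and the two-dimensional Gaussian test integrand are assumptions chosen so that the exact answer is known to be close to one.

```python
import numpy as np

def mc_integrate(f, lower, upper, n_samples=200_000, seed=0):
    """Simple Monte-Carlo integral of f over a box, with a 1-sigma error estimate.

    f must accept an array of shape (n_samples, n_dim) and return n_samples values.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    rng = np.random.default_rng(seed)
    volume = np.prod(upper - lower)                 # V, set by the bounds
    x = rng.uniform(lower, upper, size=(n_samples, lower.size))
    fx = f(x)
    mean_f = fx.mean()                              # <f>
    mean_f2 = (fx**2).mean()                        # <f^2>
    error = volume * np.sqrt((mean_f2 - mean_f**2) / n_samples)
    return volume * mean_f, error

def unit_gaussian_2d(x):
    """Normalised two-dimensional Gaussian, used here only as a test integrand."""
    return np.exp(-0.5 * np.sum(x**2, axis=1)) / (2.0 * np.pi)

# Bounds chosen so that the integrand is vanishingly small at the edges.
estimate, error = mc_integrate(unit_gaussian_2d, [-6.0, -6.0], [6.0, 6.0])
print(f"integral = {estimate:.4f} +/- {error:.4f}")
```

The error estimate falls only as $1/\sqrt{N}$, and bounds that are much wider than the region where the integrand is significant waste samples, which is one reason the cost grows quickly as more data points are added to the integral.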



3.2.5 Results 

Performing survey optimisations for future experiments
typically involves a trade-off between different configurations
that compete for resources. A classic example is a trade-off
between the depth and area of a survey for a fixed exposure
time (see Amara & Refregier 2007 for an example of this for
weak lensing surveys). Another, more difficult and often
controversial, study is the trade-off of resources between
different proposed probes. For instance, if limited resources
make it impossible to support both the SNe and BAO missions
envisioned for stage III, a natural question is whether we
should invest in one over the other, or whether scaled-down
versions of each mission should be pursued. This is a complex
issue for a number of reasons, but the model breaking figure
of merit, along with other FoMs, can help guide such decisions
by quantifying the likelihood of finding a deviation from 'the
standard cosmological model'. For this reason, our first
illustrative example focuses on a possible trade-off study
between SNe and BAO stage III surveys. We note again that a
thorough treatment of such a trade-off is complicated. For
instance, quantifying the impact of limited resources is
significantly more complicated than that of limited
observation times. We have made a number of simplifying
assumptions, so the results stated here only illustrate the
method and do not offer concrete recommendations about one
experiment over another. In this spirit, we show results for
the full Stage III surveys, as well as for scaled-down
versions. We do not attempt to link the scaled-down versions
to a fixed set of resources, since this is well beyond the
scope of this work. To scale down the surveys, we fix the
distributions in redshift (i.e. the PDF of the number of SNe
and galaxies as a function of redshift is fixed) and vary an
overall scaling. For BAO this corresponds to a change in
survey area, and for SNe this corresponds to a reduction in
the total number of SNe.

In Figure 2 we show the expectation value of the minimum
$\chi^2$ when we consider only some fraction of the area of the
Stage III surveys shown in Table 1. For instance, for a
fraction of 0.5 we divide the areas of all the BAO missions by
a factor of 2. The results are shown for different realisations
of Stage III SNe surveys, where once again the fraction refers
to the fraction of the total SNe numbers shown in Table 2. We
see that, for a range of SNe stage III configurations,
increasing the area of the BAO survey from 1% to 10% of what is
expected in stage III has no effect on the expectation value of
the minimum $\chi^2$. Beyond this, however, we see a large
increase in $\langle\chi^2_{\rm min}\rangle$ as the area of the
BAO surveys is increased, leading to mean $\chi^2_{\rm min}$
values that are greater than 5 (i.e. a reduced $\chi^2$ greater
than one) for all survey configurations with 100% of the DETF
stage III survey area. This is true with and without SNe
systematics. In Figure 3 we show similar results as a function
of the fraction of the future SNe surveys. We see a linear rise
in $\langle\chi^2_{\rm min}\rangle$ with the SNe fraction from
1% to 100% of stage III experiments. Here, the rise is less
dramatic than in the BAO case, which suggests that a discovery
is more likely to come from the BAO experiment. This result can
also be seen in Table 3, where we also show the comparison with
the other figures of merit discussed in sections 2.1 and 2.2.
The middle column shows the errors on the standard model
parameters, in this case the density of $\Lambda$, and on the
right we show the FoM proposed by the DETF, which is inversely
proportional to the area of the error ellipse in the
$w_0$-$w_a$ plane (Albrecht et al. 2006). Reassuringly, all
three measures show similar trends, which suggests that the
simple optimisations done here are reasonably robust and that
the overall information content increases from experiments with
lower FoM to those with higher FoM. This is different from the
trade-off studied in Appendix A, where the overall error bars
are fixed and the sensitivity in different regions (x values)
leads to changes in the FoMs.



                 $\langle\chi^2_{\rm min}\rangle$   $\sigma_c(\Omega_\Lambda)/\sigma_{\rm III}(\Omega_\Lambda)$   ${\rm FoM}_{\rm III}/{\rm FoM}_c$

BAO III          5.5                                10                                                            2.2

SNe III          3.5 (4.0)                          6 (14)                                                        1.1 (2.6)

BAO & SNe III    7.0 (8.0)                          10 (19)                                                       2.2 (4.4)



Table 3. Comparison between the model breaking approach
($\langle\chi^2_{\rm min}\rangle$), working within the standard
model (here we show errors on $\Omega_\Lambda$ in a model with
only a cosmological constant) and the DETF FoM (which involves
parameterising the equation of state in terms of $w_0$ and
$w_a$). The numbers in parentheses are for the case where no
systematics are included for SNe, while the other numbers
include these systematics.

Finally, we investigate a simple optimisation where we explore
the model breaking redshift sensitivity of the Stage III
surveys. We do this by boosting the performance of the surveys
at a particular redshift by dividing the statistical errors at
that redshift by a factor of 2. This is not a physically
motivated optimisation. Instead, it can be thought of as simply
probing where an improvement would be the most effective. The
results are shown in Figure 4. The coloured bars show the
fractional increase in $\langle\chi^2_{\rm min}\rangle$ for the
calculations where SNe systematics have been included. We see
here that improving the SNe survey in the two lowest redshift
bins causes a notable increase in
$\langle\chi^2_{\rm min}\rangle$, while improving the SNe
performance in higher redshift bins has little effect, except
in the no systematics case. This suggests that to go beyond
stage III SNe experiments we should focus on improving errors
at low redshifts first, unless we can demonstrate that the
systematics levels can be brought below those presented by
Kim et al. (2004) and Ishak et al. (2006). For the BAO
experiments, we find a different result. Improving the errors
in our lowest redshift bin has no effect on
$\langle\chi^2_{\rm min}\rangle$. However, if the errors in our
final redshift bin (0.7 < z < 1.0) are improved, then we see
the largest rise in $\langle\chi^2_{\rm min}\rangle$. This
suggests that a BAO experiment beyond stage III should aim to
make measurements at high redshifts.
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Figure 2. Expectation value of the minimum $\chi^2$ as a function of
the areas of the stage III BAO surveys. The fraction corresponds
to the fraction of the full survey areas (shown in Table 1) used.
These are shown for three configurations of stage III SNe surveys,
where only a fraction of the SNe in Table 2 are used. The solid
curves include SNe systematics while the dotted curves do not.
The dashed line shows the $\chi^2$ that would correspond to a
reduced $\chi^2$ of 1.
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Figure 3. Expectation value of the minimum $\chi^2$ as a function
of the number of SNe of the stage III surveys. The fraction
corresponds to the fraction of the full survey number (shown in
Table 2) used, where the PDF is fixed and only a global fraction
is applied. These are shown for three configurations of stage III
BAO survey area, where only a fraction of the areas in Table 1 are
used. The solid curves include SNe systematics while the dotted
curves do not. The dashed line shows the $\chi^2$ that would
correspond to a reduced $\chi^2$ of 1.
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Figure 4. The impact of boosting the performance in one of the
redshift bins. This is done by reducing the statistical error in the
relevant bin by a factor of 2. The y-axis shows the ratio of the
expectation value of the minimum $\chi^2$ of the boosted stage III
survey relative to the standard stage III survey. The different
colours indicate which probe has been enhanced; the solid colours
show the results when SNe systematics are included, and the dashed
lines show the results when SNe systematics are eliminated.



4 CONCLUSIONS 

We have presented a framework in which experimental op- 
timisation can be placed. Given a standard model, one can 
either (i) measure the standard model parameters to high 
precision; (ii) attempt to extend the standard model; or (iii) 
attempt to find deviations from the standard model. 

When designing an experiment to measure or extend the 
standard model, the Fisher matrix formalism can be used. 
We have introduced a framework that can be used to design 
an experiment to have the best chance of finding discrepan- 
cies with the standard model. This framework only depends 
on three sets of information (current data, future expected 
error bars and the standard model). No external assump- 
tions are needed for the calculations, though we have also 
shown how priors from the theory can, if needed, be added. 

By using a simple illustrative example, we find that the
optimal future experiment configuration can depend very
strongly on the choice of optimisation metric. In our simple
model, C = m(x - 8) + 10, the data position x = 8 is a
pivot point since C(x = 8) = 10 for any value of m. When
designing an experiment to measure the standard model, it is
optimal to have small errors away from the pivot point.
However, when designing an experiment to break the model, it
is optimal to have a small error at the pivot point, since any
measurement of $C(x = 8) \neq 10$ would provide evidence that
the standard model was incorrect. When extending the model,
the optimisation naturally depends on the exact
parameterisation of the extension.

In cosmology we have a standard model, $\Lambda$CDM. A large
number of experiments have been designed to measure an
ad hoc extension of this model, parameterisations of the
dark energy equation of state, to high accuracy. Our
recommendation here is that future cosmology missions should
be optimised by using the three approaches we have outlined
above: (i) measure the standard $\Lambda$CDM parameters;
(ii) measure extended parameters, specifically the equation
of state parameters, the DETF FoM and the modified gravity
parameter $\gamma$; and (iii) calculate the expectation value
that the experiment will find a deviation from $\Lambda$CDM.
We calculate quantities in these three regimes for SNe, BAO
(transverse modes) and the CMB peak positions by focusing
on 'current' and the DETF stage III surveys. Should the
three quality quantifiers agree, then we can be reassured that
the optimisation is somewhat robust. For instance, there has
been some concern that the DETF FoM is biased in favour
of redshifts. However, in the calculations shown in this
paper, we do not find evidence for this, with the results for
the DETF FoM being consistent with the other figures that we
have shown. In the event that the three approaches lead to
conflicting configurations, the fact that these measures
look for distinctly different things means that we should be
able to make a choice based on a judgement of the priorities
of a given experiment.
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APPENDIX A: ILLUSTRATIVE EXAMPLE 

To illustrate the distinction between the three optimisation 
approaches highlighted in this article, we will present a sim- 
ple worked example. We begin with a standard model where 
the signal C at x depends only on the parameter m (i.e. 
O = {m}). Our standard model is that 

C = m(x - 8) + 10. (A1)

For this simple example, we also assume that measurements
can only be made at x = {4, 8, 12}, where today's
measurements have yielded X = {10, 10, 10} with Gaussian errors
of variance $\sigma_X^2$ = {1, 1, 1}. This is shown in Fig. A1.
We will assume that future experiments can be built to measure
the signal at the same x positions as today but that the errors
on the measurements will be significantly smaller than those
of today. Specifically, we will assume that the quadratic sum
of the future errors, over all data points, is
$\sum_x \sigma_Y^2(x) = 2.01$ (this creates a symmetry between
the top left and right corners of Figures A2, A3, A4 and A5).
The global performance of the future experiment is, therefore,
a little better than the current one, and the optimisation
process is to decide how to optimally distribute the errors
among the three data points.
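
As a minimal sketch of this set-up, the error budget can be encoded so that choosing the future variances at x = 4 and x = 8 fixes the variance at x = 12. The function name `future_variances`, and the choice to work directly with variances, are assumptions made here for illustration and are not part of the original analysis.

```python
TOTAL_VARIANCE = 2.01  # quadratic sum of the future errors quoted in the text


def future_variances(var_4, var_8, total=TOTAL_VARIANCE):
    """Return (var_4, var_8, var_12), with var_12 fixed by the error budget."""
    var_12 = total - var_4 - var_8
    if min(var_4, var_8, var_12) <= 0.0:
        raise ValueError("all three variances must be positive")
    return (var_4, var_8, var_12)


# Two configurations that both keep a poor measurement at x = 8.
print(future_variances(0.01, 1.0))  # high precision at x = 4
print(future_variances(1.0, 1.0))   # high precision at x = 12
```

Under this reading, the two configurations printed above are mirror images under the exchange of x = 4 and x = 12, which is one way to understand the symmetry mentioned in the text.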



Figure A1. The system being used to illustrate the available
optimisation options. For this example, the black points are to- 
day's data and the red lines are examples of our standard model 
that are consistent with today's data. In the top right hand cor- 
ner, an example of the typical size of the error bars in the future 
experiment is shown. 



A1 Measuring the Standard Model

To measure the performance of a future experiment, we use the
Fisher matrix to estimate the errors on the parameter m for
specific configurations of the errors. This allows us to find
the optimal configuration of the errors for the purpose of
measuring m.

Fig. A2 shows how the future errors at the x = 4 and x = 8
points are optimised such that the error on m is minimised.
It is clear that the optimal configuration is insensitive to
the error at x = 8. This is understandable since within the
standard model there is no sensitivity to m at x = 8, so
there is no gain in placing any measurement at this point.
The optimal strategy to measure the standard model parameter
m is then to place a small future error bar at either x = 4 or
x = 12. It is also interesting to note that, since the value of
the standard model at x = 8 is fixed, it is better to have
one small error at either x = 4 or x = 12 (with the other
being large) than to distribute the errors between these two
points.
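
The Fisher forecast described above can be sketched in a few lines. Since dC/dm = (x - 8) for the standard model of equation (A1), the one-parameter Fisher matrix reduces to a sum over the three data points. The function name `sigma_m` and the two example configurations are illustrative assumptions; both configurations have a quadratic sum of 2.01.

```python
import math

X_POSITIONS = (4.0, 8.0, 12.0)
PIVOT = 8.0


def sigma_m(variances):
    """Fisher forecast of the error on m for C = m(x - 8) + 10.

    dC/dm = (x - 8), so F = sum_x (x - 8)^2 / variance_x
    and the forecast error is F^(-1/2).
    """
    fisher = sum((x - PIVOT) ** 2 / v for x, v in zip(X_POSITIONS, variances))
    return 1.0 / math.sqrt(fisher)


# Same total error budget (2.01) and same error at x = 8 in both cases.
one_small = sigma_m((0.01, 1.0, 1.0))   # all the precision placed at x = 4
shared = sigma_m((0.505, 1.0, 0.505))   # precision shared between x = 4 and 12
print(one_small, shared)
```

In this sketch the x = 8 term contributes nothing to the sum, and the configuration with a single small error bar gives the smaller forecast error on m, in line with the behaviour described for Fig. A2.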



A2 Extending the Standard Model 

To extend the model, we first have to decide on a way of
extending the standard model. We must also decide whether
to minimise the errors on the extended parameters alone
(after marginalising over m) or to minimise the errors on
both the standard and extended parameters simultaneously.
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Figure A2. The results of an optimisation analysis designed to
measure m (the only parameter of our standard model) to the
highest precision possible. The quadratic sum of the errors of the
three data points, ($\sigma^2_{x=4} + \sigma^2_{x=8} + \sigma^2_{x=12}$),
has been set to 2.01. We see that the minimal errors are achieved
for small $\sigma^2_{x=4}$ (and by symmetry $\sigma^2_{x=12}$). The
fact that the lines are close to vertical shows that this
optimisation is essentially insensitive to the measurement
precision at x = 8. This can be understood since x = 8 is a
pivot in our standard model and therefore offers no information
within our standard model since it can only have a value of 10.
For a fixed error at x = 8, we see a clear preference to minimise
the errors at either x = 4 or x = 12, which means that it is
better to have one small error bar than to minimise both.



For illustration, we assume that there are two equally valid
ways of extending the standard model used here. The first
is the addition of a quadratic term,

$C = m(x - 8) + 10 + p_1 (x - 8)^2$, (A2)

and the second is the addition of a constant,

$C = m(x - 8) + 10 + p_2$. (A3)

Again the Fisher matrix formalism is used to predict the
future errors on the model parameters, ($m$, $p_1$) or
($m$, $p_2$), given a configuration for the future data error
bars.

The results are shown in Figs. A3 and A4. We show the
errors on m (marginalised over $p_1$) and on $p_1$
(marginalised over m). We could have constructed a figure of
merit that combines the errors of m and $p_1$, but this is
somewhat superfluous in this illustrative example.

In Fig. A3 we show how the errors on m and $p_1$ from model
1 are optimised. In this case, x = 8 is a pivot point of the
extended model, so the optimal strategy is to maximise future
errors at x = 8 since the parameters are not sensitive to
data at this point. The quadratic sum of the errors at x = 4
and 12 is then minimised. We note that in this example
both of these data points are needed to distinguish between
the parabolic and the linear term.



Figure A3. Optimisation for the two parameters of extended
model 1, $C = m(x - 8) + 10 + p_1 (x - 8)^2$. The plots show the
expected marginalised errors on m and $p_1$ as a function of
possible measurement errors at x = 4 and x = 8. As in Fig. A2,
the errors at x = 12 are set by fixing the quadratic sum of the
errors to 2.01. For this extended model, we see that we are
pushed to a configuration with maximum errors at x = 8 for both
parameters m and $p_1$. As with the standard model, this extended
model has a pivot point at x = 8 and so the measurement here does
not bring useful information. Unlike the example shown in
Fig. A2, here the errors at both x = 4 and 12 are important since
they are both needed to distinguish between m and $p_1$ (with
only one data point the two parameters are degenerate). This is
why maximising the error at x = 8, and hence minimising the
quadratic sum at x = 4 and 12, is preferred.



In Fig. A4 we show how the errors on m and $p_2$ from model
2 are optimised. In this extended model, x = 8 is no longer
a pivot point of the model. In fact, a small future error bar
at x = 8 could measure $p_2$ very accurately (for a given m)
because $p_2$ is not degenerate with m at this point.
Hence, the optimisation places a small error bar at x = 8.
Next, to accurately measure m, the optimisation tries to
minimise the error at one of the two remaining points, in a
similar way to what happens in Fig. A2.
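
The two-parameter forecasts for the extended models can be sketched as follows. This is a minimal illustration that builds the 2x2 Fisher matrix from the derivatives of equations (A2) and (A3) and inverts it by hand; the function name `marginalised_errors` and the integer `extension` flag are assumptions introduced here.

```python
import math

X_POSITIONS = (4.0, 8.0, 12.0)


def marginalised_errors(variances, extension):
    """Marginalised 1-sigma errors (sigma_m, sigma_p) for an extended model.

    extension = 1: C = m(x - 8) + 10 + p1 (x - 8)^2
    extension = 2: C = m(x - 8) + 10 + p2
    """
    f_mm = f_mp = f_pp = 0.0
    for x, var in zip(X_POSITIONS, variances):
        d_m = x - 8.0                                    # dC/dm
        d_p = (x - 8.0) ** 2 if extension == 1 else 1.0  # dC/dp
        f_mm += d_m * d_m / var
        f_mp += d_m * d_p / var
        f_pp += d_p * d_p / var
    det = f_mm * f_pp - f_mp ** 2
    # The diagonal of the inverse Fisher matrix gives the marginalised variances.
    return math.sqrt(f_pp / det), math.sqrt(f_mm / det)


# Model 1: the error at x = 8 carries no information on m or p1.
print(marginalised_errors((0.5, 0.1, 0.5), extension=1))
print(marginalised_errors((0.5, 1.0, 0.5), extension=1))

# Model 2: a small error at x = 8 tightens the constraint on p2.
print(marginalised_errors((1.0, 0.01, 1.0), extension=2))
print(marginalised_errors((1.0, 1.0, 1.0), extension=2))
```

For model 1 the two printed forecasts coincide, because both derivatives vanish at x = 8. For model 2 the error on the second parameter shrinks when the error at x = 8 is reduced.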



A3 Breaking the Standard Model 

For the Fisher matrix calculations we have made the implicit
assumption that future measurement errors are Gaussian
(Tegmark et al. 1997). For the model breaking approach, we
make the same assumption, namely that the probability of
T given today's data is given by
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Figure A4. Similar to Fig. A3, this shows the optimisation for
the two parameters of extended model 2, $C = m(x - 8) + 10 + p_2$.
For the extended parameter $p_2$, we are pushed towards a
configuration with minimal errors at x = 8. Because this point is
not a pivot point of the model, it can be used to directly measure
$p_2$. For m we see that maximum precision is reached by minimising
the errors at x = 4 and 8 (or by symmetry at x = 8 and 12). This
is because the data point at x = 8 gives the best measure of $p_2$,
which is degenerate with m. Once $p_2$ is measured, only one extra
data point is needed in this model. Hence, the error at either
x = 4 or 12 should be minimised.



Figure A5. The expectation value of the future $\chi^2_{\rm min}$.
This expectation value must be maximised to have the best chance
of breaking our standard model. The colour scheme for this plot
has been chosen such that the best configuration
(max($\langle\chi^2_{\rm min}\rangle$)) is purple (dark), which is
consistent with Figs. A2 to A4, where the optimal strategies are
also purple (dark). We see that, using this criterion, the optimal
configuration is one that minimises the errors at x = 8. This can
be understood since any deviation from y = 10 at this point cannot
be explained within our standard model. Given today's data and no
guidance from theory, a high precision measurement here is,
therefore, most likely to break the standard model.



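$p(T|X) \propto \exp\left[ -\sum_x \frac{(T_x - X_x)^2}{2\sigma_X^2(x)} \right]$, (A4)

where today's data vector is once again, X = {10, 10, 10},
and the probability of the future data given T is

$p(Y|T) \propto \exp\left[ -\sum_x \frac{(Y_x - T_x)^2}{2\sigma_Y^2(x)} \right]$. (A5)

The future $\chi^2$ is given simply by

$\chi^2 = \sum_x \frac{1}{\sigma_Y^2(x)} (C_x - Y_x)^2$, (A6)

which for the illustrative standard model used here (equation
A1) is a minimum for

$m = \frac{\sum_x \sigma_Y^{-2}(x) \, (x - 8)(Y_x - 10)}{\sum_x \sigma_Y^{-2}(x) \, (x - 8)^2}$, (A7)

where the sums are over x = 4, 8, 12. These allow us, for
the simple model being considered here, to solve equation (4)
analytically.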
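
The step from the future $\chi^2$ to the best-fitting slope can be written out explicitly. The shorthand $w_x = 1/\sigma_Y^2(x)$ is introduced here for convenience and is not part of the original notation; the model is the standard model of equation (A1).

```latex
\begin{align}
\chi^2(m) &= \sum_x w_x \left[ m(x - 8) + 10 - Y_x \right]^2 , \\
\frac{\partial \chi^2}{\partial m}
  &= 2 \sum_x w_x \,(x - 8) \left[ m(x - 8) + 10 - Y_x \right] = 0 , \\
m_{\rm min} &= \frac{\sum_x w_x \,(x - 8)(Y_x - 10)}{\sum_x w_x \,(x - 8)^2} .
\end{align}
```

Because the factor (x - 8) vanishes at x = 8, the future measurement at the pivot point does not enter $m_{\rm min}$. Its contribution $w_8 (Y_8 - 10)^2$ to $\chi^2$ therefore cannot be reduced by adjusting m.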



Fig. A5 shows the result of the model breaking optimisation
for this illustrative example. To have the best chance of
breaking this standard model, one should place a very small
error bar at x = 8. This is understood since the standard
model makes a very stringent prediction at x = 8, namely
C(x = 8) = 10; any deviation from this prediction would be
proof that the standard model was incorrect.
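
The model breaking calculation can also be checked numerically. The sketch below is a Monte Carlo estimate rather than the analytic solution used in the paper. It assumes that the expectation value is the average of the minimum $\chi^2$ over future data Y drawn from a Gaussian centred on T, with T itself drawn from a Gaussian centred on today's data. The function names, the number of draws and the seed are illustrative choices.

```python
import random

X_POSITIONS = (4.0, 8.0, 12.0)
TODAY_DATA = (10.0, 10.0, 10.0)  # X in the text
TODAY_VAR = (1.0, 1.0, 1.0)      # variances of today's errors


def chi2_min(future_data, future_var):
    """Minimum over m of sum_x (C_x - Y_x)^2 / var_Y(x), C = m(x - 8) + 10."""
    weights = [1.0 / v for v in future_var]
    offsets = [x - 8.0 for x in X_POSITIONS]
    num = sum(w * a * (y - 10.0) for w, a, y in zip(weights, offsets, future_data))
    den = sum(w * a * a for w, a in zip(weights, offsets))
    m_best = num / den  # best-fitting slope
    return sum(w * (m_best * a + 10.0 - y) ** 2
               for w, a, y in zip(weights, offsets, future_data))


def expected_chi2_min(future_var, n_draws=20000, seed=1):
    """Monte Carlo average of chi2_min over T ~ p(T|X) and Y ~ p(Y|T)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        truth = [rng.gauss(x0, v ** 0.5) for x0, v in zip(TODAY_DATA, TODAY_VAR)]
        future = [rng.gauss(t, v ** 0.5) for t, v in zip(truth, future_var)]
        total += chi2_min(future, future_var)
    return total / n_draws


# Same error budget (2.01), spent either at the pivot or away from it.
at_pivot = expected_chi2_min((1.0, 0.01, 1.0))
off_pivot = expected_chi2_min((0.01, 1.0, 1.0))
print(at_pivot, off_pivot)
```

Under these assumptions, placing the small error bar at x = 8 gives a much larger average minimum $\chi^2$ than placing it at x = 4, which is the trend described for Fig. A5.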
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